@TheAssembler1
Collaborator

No description provided.

@jeanbez
Member

jeanbez commented Aug 19, 2025

We discussed adding a link to the PDC performance dashboard in Perlmutter.

@jeanbez
Member

jeanbez commented Sep 15, 2025

@TheAssembler1 can you solve those conflicts?

@TheAssembler1
Collaborator Author

@jeanbez synced with develop but there were no conflicts.

@jeanbez added the "type: documentation" and "priority: high" labels Sep 16, 2025
@jeanbez added this to the v.0.7 milestone Sep 16, 2025
@sbyna self-requested a review November 14, 2025 16:05

To view an exhaustive list of compile-time options please see :ref:`compile_time_options`.

PDC offers the following installation targets:
Member

Are these three options installation targets, or do they mean that PDC can be accessed through three APIs? I think it's the latter, and we need to give some background on the APIs.

Collaborator Author

@TheAssembler1 Nov 14, 2025

Yes, these are the two APIs (C and Python) and the VOL connector. I will add some background information before this.

data objects to find desired data efficiently as well as to store information in the data objects.

PDCs will have an impact in many science areas, given the importance of the data management and I/O software stack in achieving science discoveries at scale. The foundations of the novel data management and storage paradigm approaches and formalisms proposed in this research are expected to be applicable to a broad range of scientific and engineering problems that utilize computational and experimental facilities for predictive understanding of physical processes through data analytics and visualization. The proposed techniques are expected to accelerate the crucial process of data-driven exploration and knowledge discovery. While we will work closely with a set of key DOE science applications in the areas of cosmology, climate, genomics, and high-energy density physics to evaluate our research, the proposed new I/O paradigm will be broadly applicable to all users of DOE HPC facilities.
More information and publications about PDC are available at https://sdm.lbl.gov/pdc.
Member

We should create a page on the IDT lab website and tag PDC publications.

Soumagne, Jerome, Vishwanath, Venkat, Warren, Richard, and Tessier, François.
*Proactive Data Containers (PDC) v0.1*. Computer Software. https://github.com/hpc-io/pdc.
USDOE. 11 May. 2017. Web. doi:`10.11578/dc.20210325.1 <https://doi.org/10.11578/dc.20210325.1>`_

Member

Added text above that would replace this part.

:alt: PDC Installation Diagram
:align: center
:class: bordered-image

Member

How about starting with a separate section for this image and explaining it with some more content? Then we can move on to the installation process.

**7.3.** Short-Term Goals (Next 6-12 Months)
--------------------------------------------

- Implement enhanced **data caching and eviction policies** to reduce I/O latency.
Member

Implement enhanced client-side data caching and eviction policies to automatically exchange data from other MPI ranks and from multiple nodes.

**7.4.** Medium-Term Goals (1-2 Years)
--------------------------------------

- Introduce **multi-tier data management** (memory, SSD, disk, object storage).
Member

Integrate data movement between multiple HPC systems and between HPC and Cloud object storage systems.

--------------------------------------

- Introduce **multi-tier data management** (memory, SSD, disk, object storage).
- Develop **asynchronous data movement and prefetching** mechanisms.
Member

Develop data access patterns to trigger proactive data movement.

- Develop **asynchronous data movement and prefetching** mechanisms.
- Enhance **security and authentication** (e.g., Cray DRC, token-based access).
- Support **federated PDC deployments** across distributed sites.

Member

Add one more: Develop evaluation and remedial strategies for readying data for AI, while data is in flight.

Member

One more: Integrate PDC in DOE applications and workflows.

**7.5.** Long-Term Vision (Beyond 2 Years)
------------------------------------------

- Enable **self-optimizing data placement** based on access patterns and system telemetry.
Member

Duplicate; let's remove this.

@jeanbez marked this pull request as ready for review November 14, 2025 22:42
@jeanbez requested a review from a team as a code owner November 14, 2025 22:42
@jeanbez changed the title from "Draft: Documentation rework" to "Documentation rework" Nov 14, 2025